Online Database of Quranic Handwritten Words
Abstract
This paper presents an online Arabic handwritten words database, the first online dataset of Quranic handwritten words. The samples were collected naturally on an Acer tablet computer (1.5 GHz Core i3) by writing the words on a smooth touch screen with a stylus pen. A platform interface for collecting the handwritten words was designed in the Matlab environment. The words chosen are those most frequently repeated in the Holy Quran. The initial version of the Quranic Handwritten Words (QHW) database includes 120 handwritten words, divided equally into two sets, written by 200 writers in total. The QHW database contains 12,000 samples, comprising more than 42,800 characters and 23,300 sub-words. The words were written by a variety of people aged between 6 and 50 years from several countries. The main aim of creating this database is to use it in an online Arabic recognition system and to make it available to other researchers, since no public database is currently available for this purpose. The preprocessing steps applied to standardize all words in the database are also presented.
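The figures in the abstract can be cross-checked with a little arithmetic. The sketch below assumes, which the abstract does not state explicitly, that each of the 200 writers wrote every word of exactly one of the two sets once. Under that assumption the writer and word counts reproduce the reported sample total. The per-sample averages are lower bounds, because the character and sub-word counts are given as "more than". All variable names are illustrative.

```python
# Figures quoted in the abstract.
NUM_WORDS = 120
NUM_SETS = 2
NUM_WRITERS = 200
REPORTED_SAMPLES = 12_000
MIN_CHARACTERS = 42_800   # reported as "more than"
MIN_SUB_WORDS = 23_300    # reported as "more than"

# Assumption (not stated in the abstract): each writer writes
# every word of exactly one set, once.
words_per_set = NUM_WORDS // NUM_SETS
expected_samples = NUM_WRITERS * words_per_set

# Lower bounds on the average content of one handwritten sample.
chars_per_sample = MIN_CHARACTERS / REPORTED_SAMPLES
sub_words_per_sample = MIN_SUB_WORDS / REPORTED_SAMPLES

print(f"words per set: {words_per_set}")
print(f"expected samples: {expected_samples}")
print(f"characters per sample (at least): {chars_per_sample:.2f}")
print(f"sub-words per sample (at least): {sub_words_per_sample:.2f}")
```

With these numbers the expected sample count equals the reported 12,000, so the one-set-per-writer reading is at least consistent with the abstract.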
Similar resources
Syntactic Annotation Guidelines for the Quranic Arabic Dependency Treebank
The Quranic Arabic Dependency Treebank (QADT) is part of the Quranic Arabic Corpus (http://corpus.quran.com), an online linguistic resource organized by the University of Leeds, and developed through online collaborative annotation. The website has become a popular study resource for Arabic and the Quran, and is now used by over 1,500 researchers and students daily. This paper presents the tree...
Online Persian handwriting recognition using a language model and reduction of user writing rules
The joined-up, cursive form of Persian words, the immense variety of its scripts, and the different shapes Persian letters take depending on their position within a word have made Persian handwriting recognition a serious challenge. The major weakness of most recognition methods is their inattention to sentence context, which leads to the use of a word with correct appea...
Holistic Farsi handwritten word recognition using gradient features
In this paper we address the issue of recognizing Farsi handwritten words. Two types of gradient features are extracted from a sliding vertical stripe which sweeps across a word image. These are directional and intensity gradient features. The feature vector extracted from each stripe is then coded using the Self Organizing Map (SOM). In this method each word is modeled using the discrete Hidde...
Component-based Segmentation of Words from Handwritten Arabic Text
Efficient preprocessing is essential for automatic recognition of handwritten documents. In this paper, techniques for segmenting words in handwritten Arabic text are presented. First, connected components (CCs) are extracted, and the distances among different components are analyzed. The statistical distribution of these distances is then obtained to determine an optimal threshold for words se...
Recovery of temporal information of cursively handwritten words for on-line recognition
On-line recognition differs from off-line recognition in that additional information about the drawing order of the strokes is available. This temporal information makes it easier to recognize handwritten texts with an on-line recognition system. In this paper we present a method for the recovery of the stroke order from static handwritten images. The algorithm was tested by classifying the wor...